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Final Project: Heteroskedasticity 


Introduction 
This project analyzes via Monte Carlo simulation the performance of three different 


methods for dealing with heteroskedasticity. First, weighted least squares are used based on 
known variances, then weighted least squares are used based on unknown variances. Finally, the 
two WLS methods are compared with the use of robust standard errors found during the ordinary 
least squares process. All relevant code and command prompts not explicitly listed can be found 


in the Appendices. 


Data Generation Process 


First, 5000 random “independent” observations were created from a uniform distribution 
in Stata [Appendix A]. These were the experiment’s x-values. Subsequently, dependent-variable 
observations were created using varying degrees of heteroskedastic error functions. Error terms 
were randomly generated from a normal distribution a mean of zero (because heteroskedasticity 
does not bias our estimate of the parameter) and non-constant variance. The test equation was a 
single-variable, simple regression. The specifics of each heteroskedastic pattern are also listed 


below: 


Assumed regression equation: 
yi = Bi + Bo *xi + ei 


Patterns of heteroskedasticity: 
l. var(ei) = 077 = 0?" xi 


2. var(ei) = oi? = 02* xi? 
3. var(ei) = 0 = 02* x 
4. var(ei) = 02 = 02* Vxi 
5. var(ei) = 0? = 02* In(xi + 1) 
6. var(ei) = 0? = o?" [sin(xi) + 1] 


The true values of the parameters Bi and B2 were set equal to zero and one, respectively. For 
simplicity, the true value of o? was also set equal to one. 


Theory tells us that heteroskedasticity has two main implications. The first is that the 
least squares estimation procedure is still linear and unbiased, but it is no longer the best process 
of estimation. Second, the standard errors achieved through ordinary least squares estimation are 
wrong, meaning our t-tests and confidence intervals are not accurate. 

Hence, the focus of this investigation will primarily be on the standard errors of the B2 
parameter, and whether or not they lead us to reject the null hypothesis that the B2 parameter is 
equal to its true value (1). Because there are 5000 observations and the regression is estimating 
two parameters, all t-tests will have 4998 degrees of freedom. For a two-tailed test at the 5% 
significance level, this renders a critical t-value of +/- 1.96043870. 


[A] Weighted Least Squares Based on Known Variances (10,000 simulations) 


. summ Kb2* 


Variable | Obs Mean Std. Dev. Min Max 
EAS EEE Fete N PER a eee a IOIis. 
Kb2 1 | 10000 .9999314 -0070772 .9715984 1.026385 

Kb2 2 | 10000 .9999381 „0142252 „9455313 1.065772 

Kb2_3 | 10000 1.00001 „0173253 „9308503 1.063522 

Kb2_4 | 10000 1. 000105 „0060697 „9718258 1.026211 

Kb2_5 | 10000 .9999285 „0043244 „9829566 1.017783 

i Drd ca at Doi a rai Pic pg e pete E apa AA Ip Ap pi A oe Sos ose Se E cea sce lee pata pei 
Kb2_6 | 10000 „999938 „0053343 „9804298 1.021367 


Consistent with theory, the estimation of the B2 parameter appears to be unbiased even in 
the presence of heteroskedasticity. All six simulations are very close to having a mean of 1, 
which is the true value of the parameter. 

To test the efficacy of our standard error calculation, though, it is necessary to create a 
series of t-statistics from the data. 


. summ Kse2* 


Variable | Obs Mean Std. Dev. Min Max 
Seeseccacocece PEE EE E EE A Ra ep e A A AE ee ee 
Kse2 1 | 10000 .0070608 -0000703 .0068015 -0073441 
Kse2_2 | 10000 -0141662 -0001417 .0136043 .O147508 
Kse2 3 | 10000 „0172716 „0001742 „0166915 „0179458 
Kse2_4 | 10000 „0060622 „0000615 „0058516 „0062786 
Kse2_5 | 10000 „0042936 „0000432 „0041274 „0044714 

PWE KA Ea ER ape E E AR a a e a ON E YA oe eco pa pa ei 
Kse2_6 | 10000 „0053367 „0000533 „0051198 „0055473 


g Kb2 lt = (Kb2 1 - 1) / Kse2 1 
g Kb2 2t = (Kb2 2 - 1) / Kse2 2 
g Kb2 3t = (Kb2 3 - 1) / Kse2 3 
g Kb2 4t = (Kb2 4 - 1) / Kse2 4 
g Kb2 5t = (Kb2 5 - 1) / Kse2 5 
g Kb2 6t = (Kb2 6 - 1) / Kse2 6 


. count if (Kb2 lt >= 1.96043870) | (Kb2 lt <= -1.96043870) 
500 


. count if (Kb2 2t >= 1.96043870) | (Kb2 2t 
502 


A 
li 


-1.96043870) 
. count if (Kb2 3t >= 1.96043870) | (Kb2 3t <= -1.96043870) 
471 


. count if (Kb2 4t >= 1.96043870) | (Kb2 4t < 
491 


-1.96043870) 


. count if (Kb2 5t >= 1.96043870) | (Kb2 5t < 
499 


-1.96043870) 


. count if (Kb2 6t >= 1.96043870) | (Kb2 6t « 
513 


-1.96043870) 


Under the heteroskedastic error assumption of var(ei) = 6?“ xi, we achieved theoretically 
perfect results. Of our ten thousand simulations, exactly 5% would have led us to reject the null 
hypothesis that the true value of B2 is 1. In fact, all of the simulations seem to have produced 
reasonable standard errors. The largest deviation from what we would expect was for var(ei) = 
62* xi, which rejected the (true) null hypothesis 4.71% of the time. 


[B] Weighted Least Squares Based on Estimated Variances (10,000 simulations) 


. summ Ub2* 


Variable | Obs Mean Std. Dev. Min Max 

m n kn a aa Ri Pair IE NE e PE a Pe SEX a PN EEE 0 pi PE are E ls cee see a es 
Ub2_1 | 10000 „9999353 „0088545 „9690486 1.040536 

Ub2_2 | 10000 .9998172 .0189966 .9301063 1.073115 

ub2 3 | 10000 .9998518 .0394025 .8551449 1.143614 

Ub2_4 | 10000 -9998813 .0064553 .9717839 1.025519 

Ub2_5 | 10000 .9999789 .005465 .9796793 1.01945 

Se e m ak n po E ok n ol e n n po a e a n n N n Ia n n a a a 
Ub2_6 | 10000 .9999493 .0059335 .9772751 1.024626 

Ub2_7 | 10000 .9999837 -0345202 - 8645389 1.124931 


Again, estimates of the parameter of interest seem unbiased. They are very close to one. 


. summ Use2* 


Variable | Obs Mean Std. Dev. Min Max 

Poet ai DEI spe E Dat SIE Spot IS pp Oa i E a N E at ee i EN ape a i ee 
Use2_1 | 10000 „0100597 „0001073 „0096088 „010475 
Use2_2 | 10000 „0204653 „0002346 „0195657 „0212907 
Use2_3 | 10000 „0382149 „0009611 „03544 „042196 
Use2_4 | 10000 „006961 „0000719 „006704 „0072448 
Use2_5 | 10000 „006131 „0000644 „0058846 „006384 

E ape si Ei apa ap e We ZÈ EE PRE aa a pa hos pa popă UA ae AO A e Soe ke Mpa fa n n e ei 
Use2_6 | 10000 „0053437 „0000619 „0051163 „0055948 


On the whole, estimated standard errors from this simulation appear to be larger than 
those obtained by weighted least squares using known variances. The impact of this, though, can 
only be observed via t-tests: 


. g Ub2_1t = (Ub2 1 - 1) / Use2 1 
. g Ub2_2t = (Ub2 2 - 1) / Use2 2 
. g Ub2_3t = (Ub2 3 - 1) / Use2 3 
. g Ub2_4t = (Ub2 4 - 1) / uUse2 4 
. g Ub2_5t = (Ub2 5 - 1) / Use2 5 
. g Ub2_6t = (Ub2 6 - 1) / Use2 6 
. count if (Ub2 lt >= 1.96043870) | (Ub2_1t <= -1.96043870) 


256 


. count if (Ub2_2t >= 1.96043870) | (Ub2 2t <= -1.96043870) 
345 


. count if (Ub2 3t >= 1.96043870) | (Ub2 3t «= -1.96043870) 
550 


. count if (Ub2 4t >= 1.96043870) | (Ub2 4t <= -1.96043870) 
338 


. count if (Ub2 5t >= 1.96043870) | (Ub2 5t <= -1.96043870) 
266 


. count if (Ub2 6t >= 1.96043870) | (Ub2 6t <= -1.96043870) 
768 


At first glance, it seems that using estimated variances produces radically different results 
for our t-tests. Only in the case of var(ei) = 6?“ xi did the count of significant t-statistics near the 
number we would expect; it rejected 5.5% of results. 

Otherwise, the simulations suggest that the method of weighted least squares using 
estimated variances is too conservative in its rejection of the (true) null hypothesis. For var(ei) = 
o?* x; and var(ei) = 0?“ In(xi + 1), the null hypothesis was rejected only half as many times as 
we would expect from theory. For var(ei) = 0?“ x? and var(ei) = 0?" Vxi, slightly fewer than 
3.5% of the results were rejected. 

However, the glaring exception to the conclusion that weighted least squares based on 
known variances is too conservative lies in the simulation with a sinuous error variance term 
(var(ei) = o2* [sin(xi) + 1]). In this case, it doesn’t seem that the method of variance estimation 
was able to properly capture the (cyclical) nature of the error terms’ variances. As a consequence, 
substantially more results than we would expect from the simulation were rejected as having true 
values different from one. (Noteworthy is that this simulation was comparatively too liberal in its 
rejection of the null hypothesis for weighted least squares with known variances as well.) 


[C] Ordinary Least Squares with Robust Standard Errors (10,000 simulations) 


. summ RoB+ 


Variable | Obs Mean Std. Dev. Min Max 
ee ee +------------------------------ ---- a - -- -- == ---- + 
RoB1 | 10000 .9999235 .0077572 .973525 1.032184 
RoB2 | 10000 .9996687 „0221183 „914623 1.088525 
RoB3 | 10000 -9998195 -0647119 .7575594 1.259192 
RoB4 | 10000 1.000005 -0048954 .9817757 1.019442 
RoB5 | 10000 1.000052 „0042597 „9849213 1.018259 
ate Eat a de ma i fE Să E a e eg EEE a Sa eet pi e i n Se a pe a pa SIP A n n ee, 
RoB6 | 10000 „9999832 „0041302 „9841889 1.014413 
RoB7 | 10000 „9998409 „0244042 „9095108 1.097674 
„ Summ Rose* 
variable | Obs Mean Std. Dev. Min Max 
n n 3 a e S Pecan A BO Aa a A a n A e a in US oe See Seek eo ease se 
Rosel | 10000 .0077483 .000097 .0074013 „0081201 
Rose2 | 10000 „0219194 „0003146 „0207285 „0229809 
Rose3 | 10000 „0648196 „001016 „0606939 „0694729 
Rose4 | 10000 „0048866 „0000527 „0046728 „0050755 
Rose5 | 10000 „0042404 „0000464 „0040574 „0043922 
Dre Sarat eae e E a AIR E RN at E S Soe a on kt en e mani 
Rose6 | 10000 „0041722 „0000393 „004003 „0043218 
Rose7 | 10000 .O245702 -0002706 „023654 „0256004 


. g RoB1_t = (ROB1 - 1)/Rosel 
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(RoB2 - 1)/Rose2 


. g ROB3_t = (RoB3 - 1)/Rose3 
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(RoB4 - 1)/Rose4 


. g ROB5_t = (RoB5 - 1)/Rose5 
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(ROB6 - 1)/Rose6 


. g ROB7_t (RoB7 - 1)/Rose7 


. count if (RoB1_t >= 1.96043870) | (RoB1_t <= -1.96043870) 
515 


. count if (RoB2 t > 
487 


1.96043870) | (RoB2_t <= -1.96043870) 


. count if (RoB3_t > 
461 


1.96043870) | (RoB3_t <= -1.96043870) 


. count if (RoB4 t >= 1.96043870) | (RoB4 t <= -1.96043870) 
490 


. count if (RoB5 t >= 1.96043870) | (RoB5_t <= -1.96043870) 
496 


. count if (RoB6 t >= 1.96043870) | (RoB6 t <= -1.96043870) 
492 


. count if (RoB7 t >= 1.96043870) | (RoB7_t <= -1.96043870) 
482 


Robust standard errors seem to be much more conservative than their counterparts from 
the simulation using weighted least squares with known variances. All except var(ei) = 6?* xi 
were close to theoretically perfect, the largest deviation from 5% rejection being var(ei) = 62* 
x, just as it was in [A]. This suggests that exponential heteroskedastic error functions are dealt 
with more conservatively through all three methods. 

Robust standard errors also seem to be the most capable of dealing with sinuous 
heteroskedasticity. In this regard, they may be at least as accurate as weighted least squares with 
known variances, but they are certainly more accurate than weighted least squares with unknown 
variances. As noted above, this may have to do with the inability of variance function estimation 
to account for an error term with cyclical variance. 


Conclusion 


Weighted least squares with known variances appears to give the most accurate standard 
errors in the face of heteroskedasticity. However, it is not realistic to assume that we know the 
nature of a given error term’s variance function. Therefore, in practice, we are more likely to find 
ourselves using weighted least squares with estimated variances. The procedure by which the 
error term’s variance function is estimated seems to produce more conservative (larger) standard 
errors, as shown by the unexpectedly small number of estimates that were rejected at the 5% 
significance level. Also, this procedure did not seem capable of dealing with a sinuous error 
variance function; it produce too small of a standard error, leading to almost 8% of the 
simulation’s results being rejected in a t-test at the 5% significance level. 

Robust standard errors were less conservative than those obtained from weighted least 
squares with unknown variances, but they seemed to be more conservative than weighted least 
squares with known variances. (Although there isn’t substantial evidence to support this claim, it 
appeared as though having an error variance function with a larger exponent would produce 
more conservative estimates of a parameter’s standard error.) Finally, robust standard errors were 
able to deal with the aforementioned sinuous error variance function more effectively than 
weighted least squares with estimated variances. 


Appendix A: DGP Code and Stata Commands with Comments 


* Create 5000 randomly generated observations 
set obs 5000 
set seed 1000 
* Makes simulation replicable 


g x = 10*runiform() 

g varel = x 

g vare2 = x52 

g vare3 = x53 

g vare4 = sqrt(x) 

g vare5 = log(x+1) 

* "x+1" ensures nonnegative output 
g vare6 = sin(x) + 1 

* "+ 1" ensures nonnegative output 
g vare7 = 100*runiform() 

* a control for the "unknown" simulation, not used in report 


*load the do-file for known variances 


simulate  Kbl_l=r(Kbl_1) Ksel l=r(Ksel 1) Kb2 l=r(Kb2 l) Kse2 l=r(Kse2 l) 

Kbl 2=r(Kbl 2) Ksel 2=r(Ksel 2) Kb2 2=r(Kb2 2) Kse2 2=r(Kse2 2) Kbl 3=r(Kbl 3) 
Ksel 3=r(Ksel 3) Kb2 3=r(Kb2 3) Kse2 3=r(Kse2 3) Kbl 4=r(Kbl 4) Ksel 4=r(Ksel 4) 
Kb2 4=r(Kb2 4) Kse2 4=r(Kse2 4) Kbl 5=r(Kbl 5) Ksel 5=r(Ksel 5) Kb2 5=r(Kb2 5) 
Kse2 5=r(Kse2 5) Kbl 6=r(Kbl 6) Ksel 6=r(Ksel 6) Kb2 6=r(Kb2 6) Kse2 6=r(Kse2 6), 
reps (10000) :knownVarFin 


*load the do-file for unknown variances 


simulate Ubl_1 =r(Ubl_1) Usel 1 =r(Usel 1) Ub2_1 =r(Ub2_1) Use2 1 =r(Use2 1) Ubl 2 
=r(Ub1l 2) Usel 2 =r(Usel 2) Ub2 2 =r(uUb2 2) Use2 2 =r(Use2 2) Ubl 3=r(Ubl 3) Usel 3 
=r(Usel 3) Ub2 3=r(Ub2 3) Use2 3=r(Use2 3) Ubl 4=r(Ubl 4) Usel 4 =r(Usel 4) Ub2 4 
=r(Ub2 4) Use2 4=r(Use2 4) Ubl 5=r(Ubl1 5) Usel 5=r(Usel 5) Ub2 5=r(ub2 5) 

Use2 5=r(Use2 5) Ub1l 6=r(Ub1l 6) Usel 6=r(Usel 6) Ub2 6=r(Ub2 6) Use2 6=r(Use2 6) 
Ubl 7=r(Ub1l 7) Usel T=r(Usel 7) Ub2 T=r(Ub2 7) Use2 T=r(Use2 7), 

reps (10000) :unknownVar 


* All of the following is for use in robust standard error calculation: 
sum(x) 

g xdev2 = (x - r(mean))%*2 

* Squared deviations from the mean 


egen denom total (xdev2) 
replace denom = denom”2 
* Denominator of robust variance estimation 


g dof = 5000/ (5000-2) 
* Degrees of freedom adjustment 


*load the do-file for robust 
simulate RoBl=r(RoBl) Rosel=r(Rosel) RoB2=r(RoB2) Rose2=r(Rose2) RoB3=r(RoB3) 


Rose3=r(Rose3) RoB4=r(RoB4) Rose4=r(Rose4) RoB5=r(RoB5) Rose5=r(Rose5) RoB6=r(RoB6) 
Rose6=r(Rose6) ROB7=r(ROB7) Rose7=r(Rose7), reps(10000) :robust2 


Appendix B: Do-file for Weighted Least Squares with Known Variances 


program knownVarFin, rclass 


/* 
Brett Beutell 
Econ 312, Spring '12 


Monte Carlo simulation of weighted least squares 
with known variances 


*/ 


//Generate errors 


g el = rnormal(0,sqrt(varel) ) 
g e2 = rnormal(0,sqrt(vare2)) 
g e3 = rnormal(0,sqrt(vare3) ) 
g e4 = rnormal(0,sqrt(vare4) ) 
g e5 = rnormal(0,sqrt(vare5)) 
g e6 = rnormal(0,sqrt(vare6) ) 


//Generate y variables based off of random errors 


g ykl = 0 + 1*x + el 
g yk2 = O+ 1*x + e2 
g yk3 = 0 + 1*x + e3 
g yk4 = 0 + 1*x + e4 
g yk5 = 0 + 1*x + e5 
g yk6 = 0 + 1*x + e6 


//Generate transformed y-variable 


g ysl = yk1/sqrt(varel) 
g ys2 = yk2/sqrt(vare2) 
g ys3 = yk3/sqrt(vare3) 
g ys4 = yk4/sqrt(vare4) 
g ys5 = yk5/sqrt(vare5) 
g ys6 = yk6/sqrt(vare6) 


//Generate transformed x1 variable (result of dividing the constant) 
x1lsl = 1/sqrt(varel) 
xls2 = 1/sqrt(vare2) 
xls3 = 1/sqrt(vare3) 
xls4 = 1/sqrt(vare4) 
x1s5 = 1/sqrt(vare5) 
x1s6 = 1/sqrt(vare6) 


Q 


aaQaaea 


//Generate transformed x2 variable 
x2s1 = x/sqrt(varel) 
x2s2 = x/sqrt(vare2) 
x2s3 = x/sqrt(vare3) 
x2s4 = x/sqrt(vare4) 
x2s5 = x/sqrt(vare5) 
x2s6 = x/sqrt(vare6) 


Q 


(Cee Ee e e) 


//Regress transformed model with suppressed constant 
// Return scalars 
reg ysl xlsl x2s1, noc 
return scalar Kbl 1= b[x1ls1l] 


return scalar Ksel 1= se[x1ls1l] 
return scalar Kb2 1= b[x2s1l] 
return scalar Kse2 1= se[x2s1] 


reg ys2 xls2 x2s2, noc 
return scalar Kbl 2= b[x1s2] 
return scalar Ksel 2= se[x1s2] 
return scalar Kb2 2= b[x2s2] 
return scalar Kse2 2= se[x2s2] 


reg ys3 xls3 x2s3, noc 
return scalar Kbl 3= b[x1s3] 
return scalar Ksel 3= se[x1s3] 
return scalar Kb2 3= b[x2s3] 
return scalar Kse2 3= se[x2s3] 


reg ys4 xls4 x2s4, noc 
return scalar Kbl 4= b[x1s4] 
return scalar Ksel 4= se[x1s4] 
return scalar Kb2 4= b[x2s4] 
return scalar Kse2 4= se[x2s4] 


reg ys5 xls5 x2s5, noc 
return scalar Kbl 5= b[x1s5] 
return scalar Ksel 5= se[x1s5] 
return scalar Kb2 5= b[x2s5] 
return scalar Kse2 5= se[x2s5] 


reg ys6 xls6 x2s6, noc 
return scalar Kbl 6= b[x1s6] 
return scalar Ksel 6= se[x1s6] 
return scalar Kb2 6= b[x2s6] 
return scalar Kse2 6= se[x2s6] 


// Drop variables to allow for repetition 
drop e* 
drop y* 
drop x1l* 
drop x2+ 


end 


Appendix C: Do-file for Weighted Least Squares with Estimated Variances 


program unknownVar, rclass 


/* 
Brett Beutell 
Econ 312, Spring '12 


Monte Carlo simulation of weighted least squares 
with unknown, estimated variances 


*/ 


e = rnormal(0,1) 

el = rnormal(0O,sqrt(varel)) 
e2 = rnormal(0,sqrt(vare2) ) 
e3 = rnormal(0,sqrt(vare3) ) 
rnormal (0, sqrt(vare4) ) 
e5 = rnormal(0,sqrt(vare5) ) 
e6 = rnormal(0,sqrt(vare6) ) 
e7 = rnormal(0,sqrt(vare7) ) 


aeaeeanaaaaq 
Q 
rs 
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el 
e2 
e3 
e4 
e5 
e6 
e7 


1*x 
1*x 
1*x 
1*x 
1*x 
1*x 
1*x 


aaa 
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+++++++ 
+++++++ 


/* 

The following process will be repeated seven times for each degree of 
heteroskedasticity. 

Only the first regression equation will have comments, which should illuminate what is 
happening for each equation. 


*/ 


// Ordinary Least Squares Regression of y on x 
reg yukl x 


// Find the error terms from said regression 
predict ehatl, resid 


// Transform the error terms 
g 1n_ehat1Sq = log(ehat1”2) 


// Regress the transformed error terms on x 
reg ln_ehat1Sq x 


// Get the fitted values from said regression 
predict ln eVarHatl 


reg yuk2 x 

predict ehat2, resid 

g ln ehat2Sq = log(ehat2”2) 
reg ln ehat2Sq x 


predict ln eVarHat2 


reg yuk3 x 

predict ehat3, resid 

g ln ehat3Sq = log(ehat3%2) 
reg ln ehat3Sq x 

predict ln eVarHat3 


reg yuk4 x 

predict ehat4, resid 

g ln ehat4Sq = log(ehat4”2) 
reg ln ehat4Sq x 

predict ln eVarHat4 


reg yuk5 x 

predict ehat5, resid 

g ln ehat5Sq = log(ehat5`2) 
reg ln ehat5Sq x 

predict ln eVarHat5 


reg yuk6 x 

predict ehat6, resid 

g ln ehat6Sq = log(ehat6`2) 
reg ln ehat6Sq x 

predict ln eVarHat6 


reg yuk7 x 

predict ehat7, resid 

g ln ehat7Sq = log(ehat7”2) 
reg ln ehat7Sq x 

predict ln eVarHat7 


// Exponentiate the fitted values from above to obtain variance estimates 


eVarHatl = exp(In eVarHatl) 
eVarHat2 = exp(In eVarHat2) 
eVarHat3 = exp(In eVarHat3) 
eVarHat4 = exp(In eVarHat4) 
eVarHat5 = exp(In eVarHat5) 
eVarHat6 = exp(In eVarHat6) 
eVarHat7 = exp(In eVarHat7) 


aeeaaueaa 


// Create transformed y-variable for weighted least squares regression 


ysl = yukl/sqrt(eVarHatl) 
ys2 = yuk2/sqrt(eVarHat2) 
ys3 = yuk3/sqrt(eVarHat3) 
ys4 = yuk4/sqrt(eVarHat4) 
ys5 = yuk5/sqrt(eVarHat5) 
ys6 = yuk6/sqrt(eVarHat6) 
ys7 = yuk7/sqrt(eVarHat7) 


QaouQuovQoun 


// Create transformed xl-variable for weighted least squares regression 


xls1l = 1/sqrt(eVarHat1) 
xls2 = 1/sqrt(eVarHat2) 
x1s3 = 1/sqrt(eVarHat3) 
xls4 = 1/sqrt(eVarHat4) 
x1s5 = 1/sqrt(eVarHat5) 
x1s6 = 1/sqrt(eVarHat6) 
x1s7 = 1/sqrt(eVarHat7 ) 


aeeaauaa 


// Create transformed x2-variable for weighted least squares regression 


x2s1 = x/sqrt(eVarHatl) 
x2s2 = x/sqrt(eVarHat2) 
x2s3 = x/sqrt(eVarHat3) 
x2s4 = x/sqrt(eVarHat4) 
x2s5 = x/sqrt(eVarHat5) 
x2s6 = x/sqrt(eVarHat6) 
x2s7 = x/sqrt(eVarHat7 ) 


aeaeaaueaa 


// Regress transformed y-var on the transformed x-vars 
// Suppress the constant because it is estimated by xls variables 


reg ysl xlsl x2s1, noc 
return scalar Ub1_1=_b[x1s1] 
return scalar Usel 1= se[x1ls1] 
return scalar Ub2 1= b[x2s1l] 
return scalar Use2 1= se[x2s1] 


reg ys2 xls2 x2s2, noc 
return scalar Ubl 2= b[x1s2] 
return scalar Usel 2= se[x1s2] 
return scalar Ub2 2= b[x2s2] 
return scalar Use2 2= se[x2s2] 


reg ys3 xls3 x2s3, noc 
return scalar Ubl 3= b[x1s3] 
return scalar Usel 3= se[x1s3] 
return scalar Ub2 3= b[x2s3] 
return scalar Use2 3= se[x2s3] 


reg ys4 xls4 x2s4, noc 
return scalar Ubl 4= b[x1s4] 
return scalar Usel 4= se[x1s4] 
return scalar Ub2 4= b[x2s4] 
return scalar Use2 4= se[x2s4] 


reg ys5 xls5 x2s5, noc 
return scalar Ubl 5= b[x1s5] 
return scalar Usel 5= se[x1s5] 
return scalar Ub2 5= b[x2s5] 
return scalar Use2 5= se[x2s5] 


reg ys6 xls6 x2s6, noc 
return scalar Ubl 6= b[x1s6] 


return scalar Usel 6= se[x1s6] 
return scalar Ub2 6= b[x2s6] 
return scalar Use2 6= se[x2s6] 


reg ys7 xls7 x2s7, noc 
return scalar Ubl 7= b[x1s7] 
return scalar Usel 7= se[x1s7] 
return scalar Ub2 7= b[x2s7] 
return scalar Use2 7= se[x2s7] 


// Lastly, just have to drop a lot of variables 
drop e* 
drop y* 
drop ln+ 
drop x1* 
drop x2+ 
end 


Appendix D: Do-file for Robust Standard Errors via Ordinary Least Squares 


program robust2, rclass 


/* 
Brett Beutell 
Econ 312, Spring '12 


Monte Carlo simulation of ordinary least squares 
with robust standard errors 


*/ 

// Data Generation 
g el = rnormal(0,sqrt(varel) ) 
g e2 = rnormal(0,sqrt(vare2)) 
g e3 = rnormal(0,sqrt(vare3) ) 
g e4 = rnormal(0,sqrt(vare4) ) 
g e5 = rnormal(0,sqrt(vare5)) 
g e6 = rnormal(0,sqrt(vare6)) 
g e7 = rnormal(0,sqrt(vare7) ) 
g yrl = 0 + 1*x + el 
g yr2 = 0 + 1*x + e2 
g yr3 = 0 + 1*x + e3 
g yr4 = 0 + 1*x + e4 
g yr5 = 0 + 1*x + e5 
g yr6 = 0 + 1*x + e6 
g yr7 = 0 + 1*x + e7 


// OLS regression with robust standard errors, record results 
// I trust Stata’s robust standard error calculation much more than my own 


reg yrl x, vce(r) 
return scalar RoBl = _b[x] 
return scalar Rosel = _se[x] 


reg yr2 x, vce(r) 
return scalar RoB2 = _b[x] 
return scalar Rose2 = _se[x] 


reg yr3 x, vce(r) 
return scalar RoB3 = _b[x] 
return scalar Rose3 = _se[x] 


reg yr4 x, vce(r) 
return scalar RoB4 = _b[x] 
return scalar Rose4 = _se[x] 


reg yr5 x, vce(r) 
return scalar RoB5 = _b[x] 
return scalar Rose5 = _se[x] 


reg yr6 x, vce(r) 
return scalar RoB6 = _b[x] 
return scalar Rose6 = se[x] 


reg yr7 x, vce(r) 
return scalar RoB7 = _b[x] 
return scalar Rose7 = _se[x] 


// Drop variables 
drop e* 
drop y* 


end 


